Fix/issue 3311#3313
Conversation
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit f9228b9. Configure here.
|
@aartisonigra thanks, i will have a look! |
There was a problem hiding this comment.
Thanks for tracking down #3311 — the diagnosis is right (the cluster resubscribe never fires on in-place slot migration), but I'd like to land it differently.
- server-sunsubscribe is a stale event name, the right one is sharded-channel-moved
So instead of emitting a second event from the queue, we should just point the cluster listener at the event we already emit. That keeps a single, documented event and avoids touching the queue at all.
-
The index.ts change breaks constructor arity. The queue needs no change at all, so please revert the index.ts edit entirely.
-
Please drop the FT.SEARCH changes.
-
Regression test. The existing tests never exercise the resubscribe path. Here's one that subscribes first, then migrates in place — it fails on the dead listener and passes with the fix:
// Regression for #3311: subscribe FIRST, then migrate the slot in place.
// The server pushes SUNSUBSCRIBE to the already-subscribed client, which
// must drive the cluster to rediscover and reattach the listener on the
// new owner. The dead `server-sunsubscribe` listener meant this never
// happened and the subscription was silently lost after migration.
testUtils.testWithCluster('should resubscribe a sharded channel after in-place slot migration (#3311)', async cluster => {
const SLOT = 10328, // slot of `channel`
migrating = cluster.slots[SLOT].master,
importing = cluster.masters.find(master => master !== migrating)!,
[migratingClient, importingClient] = await Promise.all([
cluster.nodeClient(migrating),
cluster.nodeClient(importing)
]);
const listener = spy();
// subscribe BEFORE migration -> the sharded PubSub client attaches to `migrating`
await cluster.sSubscribe('channel', listener);
// move the slot in-place to `importing`; `migrating` loses the slot and
// pushes SUNSUBSCRIBE to the subscribed client
await Promise.all([
migratingClient.clusterDelSlots(SLOT),
importingClient.clusterDelSlots(SLOT),
importingClient.clusterAddSlots(SLOT),
migratingClient.clusterSetSlot(SLOT, 'NODE', importing.id),
importingClient.clusterSetSlot(SLOT, 'NODE', importing.id)
]);
// the reattach is async and sharded PubSub does not buffer, so
// publish until the resubscribed listener receives the message.
// With the bug this never reattaches and the loop exhausts -> assertion fails.
for (let i = 0; i < 50 && !listener.called; i++) {
await cluster.sPublish('channel', 'message');
await setTimeout(100);
}
assert.ok(listener.calledWithExactly('message', 'channel'));
}, {
serverArguments: [],
minimumDockerVersion: [7]
});Requires import { setTimeout } from 'node:timers/promises'; at the top of the spec.
Net change: ~2 lines in cluster-slots.ts + the test. No queue/index.ts churn, no RediSearch change.
There was a problem hiding this comment.
why is the search part of this PR?

Fixes #3311
This change resolves a sharded Pub/Sub resubscription issue during in-place slot migrations in Redis Cluster.
Changes
Align sharded Pub/Sub migration event handling.
Ensure resubscribe and rediscover logic is triggered when a sharded channel is moved.
Fix typo in the emitted error event name:
sharded-shannel-moved-error → sharded-channel-moved-error
Note
Medium Risk
Touches cluster sharded Pub/Sub migration and topology rediscovery; behavior change is targeted but affects live slot moves where subscriptions must survive.
Overview
Fixes #3311: sharded Pub/Sub subscriptions could be silently dropped when a slot moved while the client was already subscribed.
The cluster sharded Pub/Sub connection now listens for
sharded-channel-moved(emitted from the node client when the server pushesSUNSUBSCRIBE) instead of the non-existentserver-sunsubscribehandler. On that event it rediscovers topology, opens the sharded client on the new slot owner, and reattaches listeners viaextendPubSubChannelListeners. Failure paths now emitsharded-channel-moved-error(typo fix fromsharded-shannel-moved-error).Adds an integration test that subscribes before migrating the channel’s slot in place, then publishes until the listener receives a message again.
Reviewed by Cursor Bugbot for commit 2b01ecf. Bugbot is set up for automated code reviews on this repo. Configure here.